Concatenated convolutional neural network model for cuffless blood pressure estimation using fuzzy recurrence properties of photoplethysmogram signals

Due to the importance of continuous blood pressure (BP) monitoring in controlling hypertension, cuffless BP estimation has been widely studied in recent years. One of the most important approaches is to explore the nonlinear mapping between recorded peripheral signals and BP values, which is usually done with deep neural networks. Because of the sequence-based, pseudo-periodic nature of peripheral signals such as the photoplethysmogram (PPG), a proper estimation model needs to be equipped with 1-dimensional (1-D) and recurrent layers. This, in turn, limits the use of the 2-dimensional (2-D) layers adopted in convolutional neural networks (CNNs) for embedding spatial information in the model. In this study, considering the advantages of chaotic approaches, the recurrence characteristics of peripheral signals were taken into account through a visual 2-D representation of the PPG in phase space, the fuzzy recurrence plot (FRP). The FRP not only provides a beneficial framework for capturing the spatial properties of input signals but also offers a reliable way of embedding their pseudo-periodic properties in neural models without using recurrent layers. Moreover, this study proposes a novel deep neural network architecture that combines the morphological features extracted simultaneously by two upgraded 1-D and 2-D CNNs, which capture the temporal and spatial dependencies of PPGs in systolic and diastolic BP estimation. The model is fed with the 1-D PPG sequences and the corresponding 2-D FRPs through two separate routes. The performance of the proposed framework was examined on the well-known public dataset Multiparameter Intelligent Monitoring in Intensive Care II (MIMIC-II). Our scheme is analyzed and compared with the literature in terms of the requirements of the standards set by the British Hypertension Society (BHS) and the Association for the Advancement of Medical Instrumentation (AAMI). The proposed model met the AAMI requirements and achieved grade A under the BHS standard. In addition, its mean absolute errors and standard deviations for systolic and diastolic blood pressure estimation were considerably low, 3.05 ± 5.26 mmHg and 1.58 ± 2.60 mmHg, respectively.

www.nature.com/scientificreports/ The rest of the paper is organized as follows. In section "Fuzzy recurrence plot", the preliminaries of the FRP are described. The proposed methodology, including the architecture of the novel deep structure, is provided in "Materials and methods". After presenting the results, further interpretations are discussed in "Discussion" section. Finally, the paper is concluded in the last section.
Fuzzy recurrence plots. Given a time series produced by a complex system, a recurrence plot (RP) is constructed by embedding the time series into phase space and calculating the distances between states. Let $X = \{x_1, x_2, \ldots, x_m\}$ be a time series in which $x_i$ is the $i$-th value of $X$. If the system trajectory is represented by $r_p$, $p = 1, 2, \ldots, N$, consisting of $N$ states, the recurrence between two states $p_i$ and $p_j$ is calculated as 40 :
$$R_{p_i,p_j} = \theta\left(\varepsilon - \lVert r_{p_i} - r_{p_j} \rVert\right), \quad i, j = 1, \ldots, N, \tag{1}$$
where $R$ is the $N \times N$ recurrence matrix known as the RP and $\theta$ is the Heaviside (step) function. When the distance between two states $p_i$ and $p_j$ is less than a predefined similarity threshold ($\varepsilon$), the corresponding element of the recurrence matrix becomes one. The recurrence matrix can be visualized as a symmetric image by assigning black and white dots to $R_{p_i,p_j} = 1$ and $R_{p_i,p_j} = 0$, respectively. The main challenge in the visual representation of RPs is the appropriate selection of $\varepsilon$: in a dynamical system, the recurrence patterns visible in the RP vary significantly when different values of the similarity threshold are considered 37 .
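The thresholded RP described above can be illustrated with a minimal NumPy sketch. The function names, the embedding dimension `dim`, the delay `delay`, and the threshold value are illustrative assumptions and are not taken from the paper.

```python
import numpy as np

def embed(series, dim=3, delay=1):
    """Time-delay embedding: row p is the state (x[p], x[p+delay], ...)."""
    series = np.asarray(series, dtype=float)
    n_states = len(series) - (dim - 1) * delay
    cols = [series[k * delay : k * delay + n_states] for k in range(dim)]
    return np.stack(cols, axis=1)

def recurrence_plot(series, eps, dim=3, delay=1):
    """Binary RP: entry (i, j) is 1 when states i and j are closer than eps."""
    states = embed(series, dim, delay)
    # Pairwise Euclidean distances between all states in phase space.
    dists = np.linalg.norm(states[:, None, :] - states[None, :, :], axis=-1)
    return (dists < eps).astype(np.uint8)

# Hypothetical pseudo-periodic input standing in for a PPG segment.
t = np.linspace(0, 4 * np.pi, 100)
rp = recurrence_plot(np.sin(t), eps=0.2)
```

Re-running the last line with a different `eps` changes the texture of `rp` noticeably, which is the threshold sensitivity that motivates the fuzzy variant.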
In the fuzzy version of the RP, called the FRP, Pham introduced a conversion of the time series into textural images based on a fuzzy relation that measures the similarity between two states in the reconstructed phase space 37,38 . This fuzzy relation alleviates the challenge of threshold selection and also enhances the visualization of image textures in comparison with conventional RPs. An FRP is also an $N \times N$ matrix whose elements represent the similarity between states by $\mu \in [0, 1]$. Let $V = \{v_1, v_2, \ldots, v_c\}$ be the set of fuzzy clusters of the states 37 . The fuzzy membership function ($\mu$) characterizes a fuzzy relation $R$ from $X$ to $V$ as a fuzzy set of $X \times V$. The fuzzy membership function, with the following properties, indicates the strength of the relationship of each pair $(x_i, v_k)$ in $R$:

Reflexivity.
$$\mu(x_i, x_i) = 1, \quad i = 1, \ldots, N. \tag{2}$$
Symmetry.
$$\mu(x_i, v_k) = \mu(v_k, x_i), \quad i = 1, \ldots, N, \; k = 1, \ldots, c. \tag{3}$$
Transitivity.
$$\mu(x_i, x_j) = \max_{k}\left[\min\left(\mu(x_i, v_k), \mu(v_k, x_j)\right)\right], \quad k = 1, \ldots, c, \tag{4}$$
where $c$ is the number of fuzzy clusters. Herein, the fuzzy clusters of the state space are obtained by the fuzzy c-means (FCM) algorithm, which determines the closeness between the states and the cluster centers. Subsequently, the similarity between pairs of states is inferred using the max-min composition of a fuzzy relation. By minimizing the following objective function, the FCM algorithm partitions the state space into $c$ overlapping clusters 37,41 :
$$J(U, Z) = \sum_{i=1}^{N} \sum_{k=1}^{c} \left(\mu_{i,k}\right)^{\omega} \left[d(x_i, z_k)\right]^{2}, \tag{5}$$
where $\omega$ is the fuzzy weighting exponent ($\omega = 2$ in this study), and $Z = \{z_1, z_2, \ldots, z_c\}$, $z_k$, and $d(x_i, z_k)$ are the vector of cluster centers, the center of the $k$-th cluster, and an inner-product-induced norm metric, respectively. In addition, the matrix of the fuzzy c-partition is denoted by $U = [\mu_{i,k}]$. To minimize the fuzzy objective function of Eq. (5), an iterative updating process based on Eqs. (6) and (7) is applied until a stopping criterion is reached:
$$\mu_{i,k} = \frac{1}{\sum_{j=1}^{c} \left[\dfrac{d(x_i, z_k)}{d(x_i, z_j)}\right]^{2/(\omega - 1)}}, \tag{6}$$
$$z_k = \frac{\sum_{i=1}^{N} \left(\mu_{i,k}\right)^{\omega} x_i}{\sum_{i=1}^{N} \left(\mu_{i,k}\right)^{\omega}}. \tag{7}$$
The stopping criterion is $\lVert U(t) - U(t+1) \rVert \le \alpha$, where $t$ is the iteration step and $\alpha$ is a small positive number indicating the level of accuracy. Compared with conventional RPs, an FRP produces gray-scale images, which are more informative than black-and-white RPs 41 . Figure 1 shows the FRPs associated with various states of the systolic and diastolic BPs. As seen, the recurrent and periodic nature of PPG signals is well represented in the corresponding FRPs.
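The FRP construction can be sketched in NumPy as follows: a standard fuzzy c-means loop followed by the max-min composition of the memberships. This is a minimal sketch under stated assumptions, not the authors' implementation: the embedding dimension, delay, number of clusters `c`, tolerance `alpha`, random initialization, and the Euclidean distance are all illustrative choices.

```python
import numpy as np

def fuzzy_c_means(states, c=3, omega=2.0, alpha=1e-5, max_iter=200, seed=0):
    """Return the (N, c) membership matrix U and the (c, dim) centers Z."""
    rng = np.random.default_rng(seed)
    u = rng.random((len(states), c))
    u /= u.sum(axis=1, keepdims=True)        # memberships of a state sum to 1
    z = None
    for _ in range(max_iter):
        w = u ** omega
        z = (w.T @ states) / w.sum(axis=0)[:, None]          # center update
        d = np.linalg.norm(states[:, None, :] - z[None, :, :], axis=-1)
        d = np.maximum(d, 1e-12)                             # avoid division by zero
        inv = d ** (-2.0 / (omega - 1.0))
        u_new = inv / inv.sum(axis=1, keepdims=True)         # membership update
        converged = np.linalg.norm(u_new - u) <= alpha       # stopping criterion
        u = u_new
        if converged:
            break
    return u, z

def fuzzy_recurrence_plot(series, dim=3, delay=1, c=3):
    """Gray-scale FRP with entries in [0, 1]."""
    series = np.asarray(series, dtype=float)
    n = len(series) - (dim - 1) * delay
    states = np.stack([series[k * delay : k * delay + n] for k in range(dim)], axis=1)
    u, _ = fuzzy_c_means(states, c=c)
    # Max-min composition: frp[i, j] = max_k min(u[i, k], u[j, k]).
    frp = np.minimum(u[:, None, :], u[None, :, :]).max(axis=-1)
    np.fill_diagonal(frp, 1.0)               # reflexivity
    return frp

# Hypothetical pseudo-periodic input standing in for a PPG segment.
frp = fuzzy_recurrence_plot(np.sin(np.linspace(0, 4 * np.pi, 100)))
```

No similarity threshold appears anywhere in this sketch; the number of clusters takes over as the main free parameter.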

Materials and methods
In our method, the PPG signal is the primary input of the model. Each signal is a vector that is converted to a 2-D FRP, and both the signal and its plot are given to the model simultaneously, because the model consists of a 1D_CNN and a 2D_CNN and therefore needs two inputs, as depicted in Fig. 2. The 1D_CNN extracts valuable features from the PPG time sequences, and the 2D_CNN also does the same but is based on [42][43][44] . In addition, the research had no therapeutic implications, and all data were de-identified to protect patient confidentiality. The mean and standard deviation of the ages of the people monitored in this database were 61.6 ± 14.6 years. The physiological waveform records of this database include multiple channels of simultaneously recorded signals (ECG, PPG, and arterial blood pressure), as well as time series of vital signs such as heart rate, systolic blood pressure (SBP), and diastolic blood pressure (DBP), digitized at 125 Hz. The labels (targets) for each case are the SBPs and DBPs, which are acquired by an invasive blood pressure monitoring device. Since the dynamics of the input signal (PPG) and its relationship to blood pressure change over time, separate processes have been used to estimate SBP and DBP.
The proposed neural network model. CNNs are one of the first choices for researchers trying to solve complex image-processing problems because of their outstanding results in feature learning and estimation. They learn the relationship between a given input and the desired output by adjusting the weights of their layers. CNNs are formed by layers that compute convolutional transforms, followed by nonlinear and pooling operators. Research has shown that results can be improved when images from multiple sources are fused 45 . Alternatively, instead of image fusion, a combination of two or more models can also enhance performance.
In this study, we propose three different CNN models, namely, 1D_CNN, 2D_CNN, and Concat_CNN. The 1D_CNN receives a PPG signal and predicts the BP from it; the 2D_CNN operates in the same way, but its input is the 2-D plot produced by the FRP function. The third model (Concat_CNN) is a combination of the other two networks. The reason for this design is that the feature maps computed in the final layers of both the 1D_CNN and 2D_CNN models are useful for determining blood pressure, but each model evaluates its input from a different perspective. Thus, combining those features before determining the output creates a type of information fusion that enhances performance. Concat_CNN is a novel two-stream CNN that receives feature maps extracted from two different 1-D and 2-D data sources as inputs. In the following sections, we first introduce the preliminaries of the 1-D and 2-D CNN models used in this study. Then, the structure of the proposed concatenated model is presented.
1D_CNN. The human visual cortex was the main motivation for the invention of CNNs, which were regarded as simple computational models of it. They are widely used in computer vision problems such as various kinds of detection, recognition, and augmentation. An adapted version of 2-D CNNs, the so-called 1-D CNNs, has changed the way 1-D biomedical signals are processed. They considerably reduce the complexity of the convolutional calculations because they have far fewer parameters and input dimensions.
As a general finding, the majority of researchers have used fewer hidden CNN layers in their 1-D CNN architectures than in 2-D CNNs. This facilitates training and implementation because there are fewer unknown parameters, often fewer than one tenth as many. Apart from the difference in the number of parameters, 2-D CNNs need dedicated, powerful graphics processing units (GPUs) to be trained on 2-D images in a reasonable time, whereas 1-D CNNs can be trained on an ordinary central processing unit (CPU). Also, because of the number of parameters, 2-D CNNs require an abundance of samples to be well trained, whereas even a limited labeled dataset is enough for training a 1-D CNN in most cases.
During forward propagation, the output map of a layer is the input of the following layer, where it is convolved with that layer's kernels as below:
$$x_k^l = b_k^l + \sum_{i=1}^{N_{l-1}} \mathrm{conv1D}\left(w_{ik}^{l-1}, s_i^{l-1}\right),$$
where $s_i^{l-1}$ and $w_{ik}^{l-1}$ stand for the output of the $i$-th neuron at layer $l-1$ and the kernel connecting it to the $k$-th neuron at layer $l$, respectively, $N_{l-1}$ is the number of neurons at layer $l-1$, $b_k^l$ is the bias of the $k$-th neuron at layer $l$, and $x_k^l$ is the input of layer $l$. Each neuron of a middle layer has an output $y_k^l$, which is computed from the input $x_k^l$ by the activation function $f$:
$$y_k^l = f\left(x_k^l\right), \qquad s_k^l = y_k^l \downarrow ss,$$
where $s_k^l$ is the output of the $k$-th neuron of layer $l$, and $\downarrow ss$ is the down-sampling operation with scalar factor $ss$.
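The forward pass of one 1-D convolutional layer, as described above, can be sketched in NumPy. The array layout, the choice of ReLU as the activation, and average pooling as the down-sampling operator are assumptions made for illustration (they match the layer types listed for the 1D_CNN, but the exact configuration is not specified here).

```python
import numpy as np

def conv1d_layer(s_prev, w, b, ss=2):
    """One 1-D CNN layer: convolution, ReLU, then down-sampling by factor ss.

    s_prev: (n_in, length) outputs of layer l-1
    w:      (n_in, n_out, kernel) kernels from neuron i to neuron k
    b:      (n_out,) biases of layer l
    """
    n_in, n_out, kernel = w.shape
    out_len = s_prev.shape[1] - kernel + 1
    x = np.tile(np.asarray(b, dtype=float)[:, None], (1, out_len))
    for i in range(n_in):
        for k in range(n_out):
            # np.convolve flips the kernel (true convolution); 'valid' keeps
            # only positions where the kernel fully overlaps the signal.
            x[k] += np.convolve(s_prev[i], w[i, k], mode="valid")
    y = np.maximum(x, 0.0)                                  # activation f = ReLU
    trimmed = y[:, : (out_len // ss) * ss]
    return trimmed.reshape(n_out, -1, ss).mean(axis=-1)     # average pooling

# Toy example: 2 input maps of length 10, 3 output maps, kernel size 3.
out = conv1d_layer(np.ones((2, 10)), np.ones((2, 3, 3)), np.zeros(3))
```

Deep-learning frameworks usually implement cross-correlation rather than flipped convolution; the two differ only in the orientation of the learned kernel.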
The architecture of our adopted 1D_CNN, which receives PPG signal samples as inputs, is depicted in Fig. 3. The number of filters is given in the architecture, and the kernel size of all convolutional layers is 25. These layers, which extract useful features from the input signal, are followed by fully connected layers. The extracted feature maps are passed through the fully connected neurons to estimate the desired output. This model consists of 30 consecutive layers, including convolutional, batch normalization, rectified linear unit (ReLU), average pooling, dropout, flatten, input, and output layers. The number of layers, the dimensions of the filters, and the way they are arranged can greatly affect the performance of the network. Therefore, it is necessary to derive a suitable initial architecture from models used in validated studies.
To select the CNN architecture used in this study, studies that had already estimated blood pressure on the same database were considered. Specifically, in 31 , a multi-stage structure consisting of convolutional layers along with LSTM layers was used. In the present paper, its CNN structure was taken as the initial model. However, because this model alone does not produce a favorable result in estimating blood pressure, the modified structure shown in Fig. 3 is introduced. This structure was obtained by trial and error on a validation set. As seen, the proposed architecture consists of 1-D convolutional (Conv1D), batch normalization, ReLU activation, average pooling, dropout, and fully connected layers.
2D_CNN. The second model proposed in our study is shown in Fig. 4. Here, 2-D FRP images of size 89 × 89 are fed to the CNN model. The processes and functions are similar to those of the previous model, but the convolution operations are 2-D instead of 1-D. As seen, there are 7 convolutional layers, each followed by batch normalization and ReLU layers. After every ReLU layer except the last two, an average pooling layer is added to decrease the dimensionality of the feature maps. Fully connected layers then produce the output from the values supplied by the feature-extraction layers. The backpropagation algorithm is used to train the networks, modifying the layers' weights by calculating the gradient with respect to all parameters.
Concat_CNN. The main model proposed in this paper is a combination of the 1D_CNN and 2D_CNN introduced above. The reason for this is that the combined model is equipped with the recurrence information of the input patterns without applying formal recurrent models, and using that complementary information improves the result. In other words, the third model, named Concat_CNN, fuses the features extracted by the previous models, which results in better information processing. Figure 5 shows our Concat_CNN architecture, comprising parallel 1-D and 2-D CNNs followed by a concatenation layer. The convolutional streams of the 1D_CNN and 2D_CNN models produce feature maps, and their fusion provides the input of the fully connected layers. In the training procedure, the coefficients of both the 1D_CNN and the 2D_CNN are tuned simultaneously as one compact network model. As seen, the proposed two-stream network is trained just like a single model: in each epoch, a PPG segment and its corresponding 2-D FRP image are fed to the Concat_CNN to update the unknown weights with respect to their target. There are four different ways of fusing a two-stream network; each of them and its formula are described below. The inputs of the fusion function are $x_t^a \in \mathbb{R}^{H \times W \times N}$ and $x_t^b \in \mathbb{R}^{H' \times W' \times N'}$, and its output is $y_t \in \mathbb{R}^{H'' \times W'' \times N''}$, where $H$, $W$, and $N$ represent the height, width, and number of channels of the feature maps, respectively 45 .
Sum fusion. $y^{\mathrm{sum}} = f^{\mathrm{sum}}(x^a, x^b)$ adds the two feature maps at the same spatial location $i, j$ and feature channel $n$. The computed value at point $(i, j, n)$ is:
$$y^{\mathrm{sum}}_{i,j,n} = x^a_{i,j,n} + x^b_{i,j,n}, \quad 1 \le i \le H, \; 1 \le j \le W, \; 1 \le n \le N.$$
Max fusion. $y^{\mathrm{max}} = f^{\mathrm{max}}(x^a, x^b)$ gives the maximum of the two input feature maps as the output:
$$y^{\mathrm{max}}_{i,j,n} = \max\left\{x^a_{i,j,n}, x^b_{i,j,n}\right\}.$$
Concatenation fusion. $y^{\mathrm{cat}} = f^{\mathrm{cat}}(x^a, x^b)$ stacks the two input feature maps at the same spatial location across the feature channels:
$$y^{\mathrm{cat}}_{i,j,2n} = x^a_{i,j,n}, \qquad y^{\mathrm{cat}}_{i,j,2n-1} = x^b_{i,j,n},$$
where $y \in \mathbb{R}^{H \times W \times 2N}$.
Conv fusion. $y^{\mathrm{conv}} = f^{\mathrm{conv}}(x^a, x^b)$ first stacks the two input feature maps at the same location $i, j$ across the feature channels $n$, as in concatenation fusion, and then convolves the stacked data with a bank of filters $f \in \mathbb{R}^{1 \times 1 \times 2N \times N}$ and biases $b \in \mathbb{R}^{N}$:
$$y^{\mathrm{conv}} = y^{\mathrm{cat}} * f + b,$$
where the number of output channels is $N$. Deep networks have a large number of parameters, which makes them hard to train, and this is even worse when the architecture consists of two streams. Fusion layers can considerably decrease the number of parameters in a deep network. In this paper, we employed the third method (concatenation fusion) because, compared with the others, it reduces the number of parameters the least. Although reducing the number of parameters is an advantage in some cases, in our model the fused features are critical for computing the output, so losing them should be avoided.
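The four fusion functions can be compared side by side in a short NumPy sketch. It assumes both inputs already share the same shape (H, W, N); in the Concat_CNN the 1-D and 2-D streams presumably have to be flattened before concatenation, which is an assumption here. The concatenation below stacks the channels in blocks rather than interleaving them, which differs from the indexed formula only by a permutation of channels.

```python
import numpy as np

def sum_fusion(xa, xb):
    """Element-wise sum; output keeps N channels."""
    return xa + xb

def max_fusion(xa, xb):
    """Element-wise maximum; output keeps N channels."""
    return np.maximum(xa, xb)

def cat_fusion(xa, xb):
    """Stack along the channel axis; output has 2N channels."""
    return np.concatenate([xa, xb], axis=-1)

def conv_fusion(xa, xb, f, b):
    """Concatenate, then apply a 1x1 filter bank f of shape (2N, N) plus bias b."""
    return cat_fusion(xa, xb) @ f + b

# Toy feature maps with H = W = 4 and N = 3 channels.
xa = np.ones((4, 4, 3))
xb = 2.0 * np.ones((4, 4, 3))
fused_sum = sum_fusion(xa, xb)
fused_max = max_fusion(xa, xb)
fused_cat = cat_fusion(xa, xb)
fused_conv = conv_fusion(xa, xb, np.ones((6, 3)), np.zeros(3))
```

Only `cat_fusion` keeps every input value available to the following fully connected layers, which is the property the text gives as the reason for choosing it.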

Results
All steps of this experiment were carried out on Colab, which provides a GPU with 12 GB of random access memory (RAM) and around 30 GB of free space for loading the dataset. In addition, we used the Matplotlib library in Python to draw the figures related to the results. The dataset comprised 200 records chosen from MIMIC-II. Each record was split into training, validation, and testing parts comprising 70%, 10%, and 20% of its length, respectively. PPG does not relate linearly to blood pressure in the same way for all people, because cardiovascular dynamics vary from person to person. Figure 6 depicts the training error, showing that the network reached its best accuracy between 200 and 300 epochs, after which it started to overfit. All models were trained with a batch size of 100 and the Adam optimizer with a learning rate of 0.001.
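The per-record 70%/10%/20% split described above can be sketched as follows. Whether the split was taken chronologically along each record is an assumption of this sketch, and `split_record` is a hypothetical helper name.

```python
def split_record(segments, percentages=(70, 10, 20)):
    """Split one record's segments in order into train/validation/test parts."""
    n = len(segments)
    n_train = n * percentages[0] // 100
    n_val = n * percentages[1] // 100
    train = segments[:n_train]
    val = segments[n_train : n_train + n_val]
    test = segments[n_train + n_val :]      # remainder goes to the test part
    return train, val, test

# Hypothetical record with 100 consecutive segments.
train, val, test = split_record(list(range(100)))
```

Splitting within each record means that every subject contributes to all three subsets.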
All three models were evaluated in terms of different measures, namely mean absolute error (MAE), mean squared error (MSE), and mean error (ME), to assess the difference between the target and the output of each model. Moreover, the coefficient of determination ($R^2$), Pearson's correlation coefficient ($R$), and standard deviation (STD) were calculated to determine the estimation accuracy. In the formulas of these measures, $Y_i$ is the observed value provided in the dataset, $\hat{Y}_i$ is the value predicted by the model, and $\bar{Y}$ is the mean of the $Y_i$ for $i = 1, \ldots, n$. The dataset was split into three subsets of 35,956, 10,574, and 5354 segments for training, validation, and testing, respectively. Table 1 shows that the Concat_CNN model achieved 3.05 ± 5.26 mmHg and 1.58 ± 2.60 mmHg for the MAE and STD of systolic and diastolic BP, respectively. The MAE of this model is around 0.2 and 0.9 mmHg lower than those of the other models. Note that the SBP error is higher because SBP values are inherently larger than DBP values, which leads to a greater error but an almost equal $R^2$ score. Figure 7 shows regression plots for all three models; the outputs of the Concat_CNN lie noticeably closer to the targets. Figure 8 shows box-and-whisker plots of the difference between the output of each model and its desired target. Both figures illustrate that the proposed two-stream concatenated model outperforms the other models. Figure 9 shows the Bland-Altman diagrams of the systolic and diastolic blood pressure outputs of the two-stream CNN model. Our Concat_CNN model was assessed using the British Hypertension Society (BHS) and the Association for the Advancement of Medical Instrumentation (AAMI) standards, which are presented in detail in the next two paragraphs.
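The evaluation measures named above can be computed with the following NumPy sketch. It uses the conventional textbook definitions; the sign convention of the error (prediction minus target) and the use of the sample standard deviation (`ddof=1`) are assumptions, since the exact formulas are not reproduced in this text.

```python
import numpy as np

def regression_metrics(y_true, y_pred):
    """Return MAE, MSE, ME, STD of the error, R^2, and Pearson's R."""
    y_true = np.asarray(y_true, dtype=float)
    y_pred = np.asarray(y_pred, dtype=float)
    err = y_pred - y_true                      # assumed sign convention
    ss_res = np.sum(err ** 2)
    ss_tot = np.sum((y_true - y_true.mean()) ** 2)
    return {
        "MAE": float(np.mean(np.abs(err))),
        "MSE": float(np.mean(err ** 2)),
        "ME": float(np.mean(err)),
        "STD": float(np.std(err, ddof=1)),
        "R2": float(1.0 - ss_res / ss_tot),
        "R": float(np.corrcoef(y_true, y_pred)[0, 1]),
    }

# Hypothetical SBP targets and predictions in mmHg.
metrics = regression_metrics([100, 110, 120, 130], [102, 108, 123, 129])
```

With BP in mmHg, the MSE returned here is in squared units (mmHg²), while the MAE, ME, and STD are in mmHg.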

Model evaluation based on British Hypertension Society (BHS) standard.
Table 2 shows the performance of the suggested method under the assessment criteria of the British Hypertension Society standard 46 . The BHS has established standards for determining the accuracy of blood pressure monitors. Techniques and equipment for measuring blood pressure are assigned different degrees of reliability according to these criteria, which are determined by setting different error thresholds and calculating the cumulative percentage of estimated samples whose error falls within each threshold. Table 2 shows the grading system, which includes grades A, B, and C. As can be observed, a monitoring technique is classified as grade A if at least 60% of the samples have an error of at most 5 mmHg, at least 85% have an error of at most 10 mmHg, and at least 95% have an error of at most 15 mmHg. Grades B and C are given if the performance falls below these levels (see Table 2). The approach presented in this article obtained grade A based on the results of the measurements.
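The BHS grading rule can be expressed as a short function. The grade A thresholds are those stated above; the grade B (50%, 75%, 90%) and grade C (40%, 65%, 85%) thresholds and the fallback grade D are taken from the published BHS protocol rather than from this text, and the function name is hypothetical.

```python
import numpy as np

# Minimum cumulative percentages of samples with |error| <= 5, 10, 15 mmHg.
BHS_GRADES = (
    ("A", (60, 85, 95)),
    ("B", (50, 75, 90)),
    ("C", (40, 65, 85)),
)

def bhs_grade(y_true, y_pred):
    """Return the BHS grade and the cumulative percentages at 5/10/15 mmHg."""
    abs_err = np.abs(np.asarray(y_pred, dtype=float) - np.asarray(y_true, dtype=float))
    pct = tuple(float(100.0 * np.mean(abs_err <= limit)) for limit in (5, 10, 15))
    for grade, minimums in BHS_GRADES:
        if all(p >= m for p, m in zip(pct, minimums)):
            return grade, pct
    return "D", pct                      # worse than grade C

# Hypothetical errors: 5 samples off by 3, 3 off by 8, 2 off by 12 mmHg.
targets = np.full(10, 120.0)
preds = targets + np.array([3, 3, 3, 3, 3, 8, 8, 8, 12, 12])
grade, pct = bhs_grade(targets, preds)
```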
Model evaluation based on AAMI standard. The output results were evaluated using the AAMI standard, as shown in Table 3. The results met all criteria because the ME was less than 5 mmHg, the STD was lower than 8 mmHg, and the total number of subjects was more than 85, for both diastolic and systolic BP. This indicates that the method satisfies the AAMI accuracy requirements.
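The three AAMI conditions listed above can be checked with a few lines of NumPy. The use of the absolute mean error, the sample standard deviation, and non-strict inequalities are assumptions of this sketch, and `meets_aami` is a hypothetical helper name.

```python
import numpy as np

def meets_aami(y_true, y_pred, n_subjects):
    """Check |ME| <= 5 mmHg, STD <= 8 mmHg, and at least 85 subjects."""
    err = np.asarray(y_pred, dtype=float) - np.asarray(y_true, dtype=float)
    mean_ok = abs(err.mean()) <= 5.0
    std_ok = err.std(ddof=1) <= 8.0
    return bool(mean_ok and std_ok and n_subjects >= 85)

# Hypothetical targets and predictions in mmHg.
aami_ok = meets_aami([100, 110, 120, 130], [102, 108, 123, 129], n_subjects=200)
```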

Discussion
In general, the results verify the superiority of deep architectures in BP estimation. This underscores the importance of applying a nonlinear mapping between peripheral signals and blood pressure values. Previously, PPG signals, alone or in combination with ECGs, showed a strong correlation with BP 7,9,31 . However, in this study, we applied just one source of peripheral signals in the estimation process. The PPG segments were time-series sequences, which made it necessary to use 1-D convolutional layers.
In addition, other studies confirmed that the performance of 1-D architectures improves when LSTM layers are adopted [29][30][31]47 . The results in Table 1 indicate that the 1D_CNN, in turn, yielded relatively good outcomes. In particular, it resulted in an R of 0.93 for both SBP and DBP estimation. As seen in the other studies, LSTM layers, which model the recursive properties of the time series, are of great importance for improving the results.
As a novel approach, we sought to evaluate the recursive characteristics of the PPG segments without enlarging the network structure. Therefore, instead of using LSTM layers, which create large multi-stage structures, we used a specific type of 2-D input, the FRP, which is able to model the pseudo-periodic behavior of PPG segments. Table 1 does not show noticeable results for the 2D_CNN by itself. However, the combination of the 1D_CNN and the FRP outperformed the other two approaches. Specifically, in our proposed Concat_CNN model, the MSE decreased from 9.34 to 6.86 mmHg² for DBP estimation and from 40.53 to 27.98 mmHg² for SBP estimation. Moreover, $R^2$ increased from 0.87 to 0.91 for DBP estimation and from 0.87 to 0.92 for SBP estimation. As a result, without the LSTM layers, the FRP provided a 2-D image that reveals spatial and recurrence-based properties of PPG segments in BP estimation. Figure 9 is devoted to the Bland-Altman plots for SBP and DBP. The X-axis shows the average BPs, that is, the means of the estimated and actual BP values. The differences between the estimated SBP and DBP and their actual values are shown on the Y-axis. The mean of the errors (bias) and the limits of agreement (bias ± 1.96 × standard deviation (SD)) are drawn as dotted lines, verifying that more than 95% of the points lie within the limits of agreement. This finding is consistent with the results reported in Table 3, satisfying the BHS standard. Table 4 compares the results of our proposed framework with previous studies, including traditional methods and deep architectures. The traditional methods compared here are Adaboost, random forest, and dynamical modeling. This table demonstrates that deep learning models outperform the traditional methods in terms of R and STD.
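The Bland-Altman quantities discussed in this section (bias and limits of agreement) can be computed as in the following sketch. The data are synthetic and purely illustrative, the error sign convention is an assumption, and the function name is hypothetical.

```python
import numpy as np

def bland_altman(y_true, y_pred):
    """Return per-sample means and differences, bias, limits, fraction inside."""
    y_true = np.asarray(y_true, dtype=float)
    y_pred = np.asarray(y_pred, dtype=float)
    mean_bp = (y_true + y_pred) / 2.0          # X-axis of the plot
    diff = y_pred - y_true                     # Y-axis of the plot
    bias = diff.mean()
    sd = diff.std(ddof=1)
    limits = (bias - 1.96 * sd, bias + 1.96 * sd)
    inside = float(np.mean((diff >= limits[0]) & (diff <= limits[1])))
    return mean_bp, diff, float(bias), limits, inside

# Synthetic example only: targets around 120 mmHg with noisy predictions.
rng = np.random.default_rng(0)
targets = rng.normal(120.0, 15.0, 500)
preds = targets + rng.normal(0.5, 5.0, 500)
mean_bp, diff, bias, limits, inside = bland_altman(targets, preds)
```

Plotting `diff` against `mean_bp` with horizontal lines at `bias` and the two `limits` gives the diagram of the kind shown in Figure 9.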
Overall, deep networks are reliable models for BP estimation because of their ability to recognize discriminative features with a huge number of parameters. Our comparative analysis also showed that the introduced framework outperformed the methods applied to the MIMIC-II dataset. This finding implies that the proposed combined model distinguishes more effectively between different BP conditions. In particular, its correlation coefficient (R) was 0.16, 0.08, and 0.01 greater than those of the LSTM, Res-LSTM, and Multistage Deep Neural Network (NN), respectively, all of which use LSTM layers in their structure. Therefore, the absence of LSTM layers in our proposed Concat_CNN did not hinder the inclusion of time-related information in the model.
In the 2D_CNN, the size of the input FRP image may influence the approximation ability of the model. The larger the input image, the longer the network takes to learn. Conversely, a small input image may not lead to the extraction of discriminative features. This directly affects the performance of the final Concat_CNN model. Therefore, we applied input FRP images of sizes 60 × 60, 88 × 88, and 100 × 100 to the Concat_CNN, and the results are presented in Table 5. As seen, although the results are not very sensitive to image size, the lowest error values and the highest $R^2$ were achieved when the FRP image size was set to 88 × 88, verifying the optimal performance.
Although the combination of chaotic nonlinear approaches and CNN architectures produced feasible outcomes and the suggested method proved useful, some limitations remain. First, the dataset was a part of the MIMIC-II database, so the results cannot be fairly compared with studies carried out on other versions of the BP monitoring data sources. Second, we found that the 2D_CNN could not outperform the traditional methods on its own. Therefore, despite the use of neural networks with different architectures, employing various handcrafted features can help improve the results. Third, many factors, such as illness or medication, can change BP variations, rendering our method's results uncertain. In the method we used, the effects of these factors were not accessible; they affected the input, and we used the same affected data to train our model. As a potential solution, this issue could be treated as a static variable, making the model sensitive to the type of participant.
Cuff-based measurement is not a proper choice for long-term monitoring of blood pressure because of the significant time between successive BP recordings. In addition, invasive BP measurement requires catheterization and is not widely used in ambulatory conditions. Therefore, recent studies concentrate on estimating BP through powerful soft computing approaches, such as deep neural structures based on PPG and ECG. Since the pseudo-periodic characteristics of PPG are highly significant in BP estimation, recurrence-based architectures are beneficial in estimation frameworks. In this study, the recurrence properties of PPG signals are included in the model by employing their nonlinear features in phase space. In other words, fuzzy recurrence plots are adopted to explore the pseudo-periodic nature of PPGs. Moreover, the FRP provides 2-dimensional recurrence images that add a spatial characterization of the inputs to the estimation model. In addition, a novel double-route concatenated convolutional neural network model, which concatenates the morphological and recurrence properties of PPGs from two separate lines, is presented in our study. Our results indicated promising outcomes when sequence-based and recurrent features of PPGs were adopted in the proposed model. Although the proposed framework outperformed the methods reviewed in the literature, disruptive factors such as the subject's illness or medication should be considered in future studies. Also, the proposed network model could be equipped with recurrent-link architectures such as long short-term memory layers and compared with the pure convolutional neural network models. Since our proposed methodology concentrates on applying deep neural networks to continuous BP estimation, a gradient-based training algorithm was adopted.
However, as future work our approach can be verified by the most representative metaheuristic learning-based optimization algorithms (LIOA) 49 , such as monarch butterfly optimization (MBO) 50 , earthworm optimization algorithm (EWA) 51 , elephant herding optimization (EHO) 52 , moth search (MS) 53 algorithm, and colony predation algorithm (CPA) 54 .